This paper develops a point-mutation model describing the evolutionary dynamics of a population
of adult stem cells. Such a model may prove useful for quantitative studies of tissue aging and the 
emergence of cancer. We consider two modes of chromosome segregation: (1) Random segregation, 
where the daughter chromosomes of a given parent chromosome segregate randomly into the stem 
cell and its differentiating sister cell. (2) "Immortal DNA strand" co-segregation, for which the 
stem cell retains the daughter chromosomes with the oldest parent strands. Immortal strand co- 
segregation is a mechanism, originally proposed by Cairns (J. Cairns, Nature 255, 197 (1975)), by 
which stem cells preserve the integrity of their genomes. For random segregation, we develop an 
ordered strand pair formulation of the dynamics, analogous to the ordered strand pair formalism 
developed for quasispecies dynamics involving semiconservative replication with imperfect lesion 
repair (in this context, lesion repair is taken to mean repair of postreplication base-pair mismatches). 
Interestingly, a similar formulation is possible with immortal strand co-segregation, despite the fact 
that this segregation mechanism is age-dependent. From our model we are able to mathematically 
show that, when lesion repair is imperfect, then immortal strand co-segregation leads to better 
preservation of the stem cell lineage than random chromosome segregation. Furthermore, our model 
allows us to estimate the optimal lesion repair efficiency for preserving an adult stem cell population 
for a given period of time. For human stem cells, we obtain that mispaired bases still present after 
replication and cell division should be left untouched, to avoid potentially fixing a mutation in both 
DNA strands.

I. INTRODUCTION

The generation and maintenance of tissues in mammals 
is currently a topic of intense investigation by experimen- 
tal and theoretical biologists. Besides its intrinsic scien- 
tific interest, an understanding of tissue cell kinetics, ar- 
chitecture, and development has important implications 
for aging and cancer. 

In vertebrate animals, many tissues and organs are 
generated by what are known as adult (or equivalently, 
somatic) stem cells. Adult stem cells are rare, undiffer- 
entiated cells that divide asymmetrically to renew differ- 
entiated cells in adult tissues. They divide to produce 
the original stem cell, and a differentiating progeny cell. 
The differentiating progeny cell then proceeds through a 
series of division and differentiation steps (see Figure 1), 
to produce a large collection of mature tissue cells. 

At this point, it is not clear how adult stem cells
emerge in multicellular organisms, nor is it known how
this method of generating tissue cells evolved. Neverthe- 
less, it is believed that this mechanism may serve to delay 
the emergence of cancer in mammals. 

Mature skin cells, for example, are continually regenerated
by adult stem cells. The tissue cells, after undergoing a
prespecified number of divisions, cease dividing (a process
known as terminal differentiation), and are eventually shed.
Thus, any potentially cancerous mutation
in differentiated skin tissue cells will eventually leave the 
body, thereby reducing the risk of skin cancer. 

In order to effectively reduce mutation rates, however, 
there must exist a mechanism or collection of mecha- 
nisms that protect the genetic integrity of the adult stem 
cell population. Otherwise, because adult stem cells are 
long-lived in the body, they will eventually accumulate a 
sufficient number of mutations to become cancerous, or 
become genetically inferior stem cells. 

One important mechanism by which adult stem cells 
protect the integrity of their genomes is through a form of 
asymmetric chromosome segregation during cell division, 
known as immortal DNA strand co-segregation. The 
immortal strand hypothesis was originally proposed by
Cairns (Nature 255, 197 (1975)). It states that when an adult stem cell divides

FIG. 1: (Color online) Generation of differentiated tissue cells
(green) from an adult stem cell (blue). 




FIG. 2: (Color online) Illustration of immortal DNA strand 
chromosome segregation. 



to form a stem cell and a differentiating tissue cell, the 
stem cell retains the chromosomes with the oldest DNA 
strands of the genome (see Figure 2). Presumably, the 
oldest DNA strands of the genome provide the most ac- 
curate template for daughter strand synthesis, and hence 
their preferential segregation into the adult stem cells en- 
sures optimal maintenance of stem cell genetic integrity 
and overall tissue health. 

The immortal strand mechanism was recently confirmed
experimentally. The confirmation of this



segregation mechanism has motivated the authors to de- 
velop a mathematical model describing the evolutionary 
dynamics of a population of adult stem cells. 

We are interested in three aspects of stem cell evolu- 
tionary dynamics: First of all, we seek to develop a set of 
ordinary differential equations describing the evolution- 
ary dynamics of a population of adult stem cells. This is 
done in the following section. For simplicity, we assume
an infinite-population, continuous-time model. While this
is not strictly correct, stochastic simulations already show
good agreement for populations of as few as 10,000 stem cells.

Second, we wish to rigorously show that immortal 
strand co-segregation is necessary to preserve the stem 
cell lineage. Immortal strand co-segregation can only 
provide an advantage, however, if, during a process 
known as lesion repair, not all postreplication DNA mismatches
are corrected. Otherwise, daughter-strand synthesis errors can
become fixed as mutations in both parent and daughter strands,
thereby eliminating the advantage of keeping the oldest
template strand in the stem cell.

Finally, because a high lesion repair efficiency reduces 
the overall mutation rate, while low lesion repair effi- 
ciency preserves the information in the parent strand, 
there is an optimal lesion repair efficiency for maximally 
preserving the stem cell lineage for a given period of time. 
In our case, the period of time of interest is a human life- 
time, which we take to be on the order of 80 years. 

In the following section, we derive the finite sequence 
length equations describing the evolutionary dynamics 
of adult stem cells, for the cases of random segregation 
versus immortal strand co-segregation. In particular, we 
develop an ordered strand pair formulation of the dy- 
namics, analogous to the ordered strand pair formula- 
tion of the quasispecies equations for semiconservative 
replication with imperfect lesion repair [7]. For
random segregation, the equations derived are similar to 
the corresponding quasispecies equations. For immortal 
strand co-segregation, the equations are qualitatively dif- 
ferent. Nevertheless, despite the age-dependence of the 
chromosome segregation mechanism, for immortal strand 
co-segregation it is still possible to develop an ordered 
strand pair formulation of the dynamics. 

In Section III, we derive the infinite sequence length 
form of the evolutionary dynamics equations, for a class 
of fitness landscapes defined by a master genome. These 
equations are analogous to the equations developed for 
semiconservative replication with imperfect lesion repair
[7]. We then proceed to obtain the system of differential
equations governing the decay of the stem cell population 
with the master-genome genotype. 

We continue in Section IV, where we use the master- 
genome equations to determine the optimal lesion repair 
efficiency for preserving the stem cell lineage for a given 
amount of time. In particular, we show that lesion repair 
should be turned off in stem cells. That is, postreplication
DNA mismatches should be left uncorrected in stem
cells.

We conclude in Section V with a summary of our re- 
sults, and plans for future research. 



II. DERIVATION OF THE FINITE SEQUENCE 
LENGTH EQUATIONS 

A. Definitions 

We consider a population of $N_S$ replicating adult stem
cells. As is illustrated in Figure 1, each of these stem
cells generates a lineage of differentiated tissue cells.

We assume that each stem cell has a genome consisting of a
single, double-stranded DNA molecule. A given genome may then
be denoted by the set $\{\sigma, \sigma'\}$, where $\sigma$ and
$\sigma'$ denote the two strands. In principle, DNA consists of
two antiparallel, complementary strands. Thus, a genome of
length $L$ should consist of the strand $\sigma$ and its
complement $\bar{\sigma}$, where $\sigma = b_1 \ldots b_L$ and
$\bar{\sigma} = \bar{b}_L \ldots \bar{b}_1$ ($\bar{b}_i$ denotes
the complement of $b_i$. For the four bases used in DNA,
complementarity is defined by the Watson-Crick pairs
Adenine:Thymine (A:T) and Guanine:Cytosine (G:C). See Figure 1
in Q). However, due to mutations, it is possible that the two
strands of a given genome are not perfectly complementary, and
so we have to relax this restriction.

We also assume first-order growth, so that with each genome
$\{\sigma, \sigma'\}$ is associated a first-order growth rate
constant $\kappa_{\{\sigma, \sigma'\}}$. The collection of all
first-order growth rate constants is known as the fitness
landscape. For simplicity, we assume in this paper a static, or
time-independent, landscape.

As with all cells with double-stranded DNA genomes, 
we assume semiconservative replication, where the 
genome of each cell unzips to form two strands, each of 
which serves as a template for the synthesis of the com- 
plementary daughter strands. The end result is two new 
daughter genomes, one of which is retained by the stem 
cell, while the other becomes the genome of the differentiating
sister. When genome $\{\sigma, \sigma'\}$ replicates, we assume
that daughter strand synthesis has an associated per-base
mismatch probability of $\epsilon_{\{\sigma, \sigma'\}}$.

After replication is complete, and stem cell division 
has occurred, there may still be some errors in the daugh- 
ter strands which were missed by various error-correction 
mechanisms (DNA polymerase proofreading and mis- 
match repair). These mismatches result in lesions along 
the DNA chain, which may be recognized and repaired 
by various maintenance enzymes in the cell. It should 
be noted that in this case, the cell cannot distinguish 
between parent and daughter strands (which it does dur- 
ing daughter strand synthesis). Thus, a given error in the 
daughter strand has a 50% probability of being corrected, 
but it also has a 50% probability of being communicated 
to the parent strand. When this happens, the mutation is 
said to be fixed in the genome. Lesion repair is generally 
not perfect, and so we assume that when genome $\{\sigma, \sigma'\}$
replicates, a postreplication mismatch in the resulting
daughter genomes is repaired with probability
$\lambda_{\{\sigma, \sigma'\}}$.

Errors during daughter strand synthesis and lesion repair
result in a probability distribution for the possible daughter
genomes which can be generated from a given parent strand.
Thus, we define $p((\sigma'', \sigma'''), \{\sigma, \sigma'\})$ to
be the probability that parent strand $\sigma''$, as part of
genome $\{\sigma'', \sigma'''\}$, forms the daughter genome
$\{\sigma, \sigma'\}$.

We may also note that $\sigma''$ can form $\{\sigma, \sigma'\}$ by
either becoming $\sigma$, with daughter strand $\sigma'$, or
$\sigma'$, with daughter strand $\sigma$. The probability of the
former process is denoted by
$p((\sigma'', \sigma'''), (\sigma, \sigma'))$, and the probability
of the latter process is denoted by
$p((\sigma'', \sigma'''), (\sigma', \sigma))$. Note that if
$\sigma \neq \sigma'$, then
$p((\sigma'', \sigma'''), \{\sigma, \sigma'\}) = p((\sigma'', \sigma'''), (\sigma, \sigma')) + p((\sigma'', \sigma'''), (\sigma', \sigma))$,
while
$p((\sigma'', \sigma'''), \{\sigma, \sigma\}) = p((\sigma'', \sigma'''), (\sigma, \sigma))$.
An expression for $p((\sigma'', \sigma'''), (\sigma, \sigma'))$ was
derived in [7].

Finally, because stem cell division (more properly, asymmetric
self-renewal) results in a constant value for $N_S$, it is
equivalent to look at population fractions. We therefore define
$x_{\{\sigma, \sigma'\}}$ to be the fraction of the stem cell
population (at a given time $t$) with genome $\{\sigma, \sigma'\}$.

For immortal strand co-segregation, the preceding definitions
need to be somewhat modified, since we need to also keep track
of the ages of the strands. To this end, we let $\sigma^{(T)}$
denote a strand which has been the template (parent) strand at
least once, while $\sigma^{(N)}$ denotes a strand which has never
been the template for the synthesis of a daughter strand. For
immortal strand co-segregation, then, we consider genomes of the
form $\{\sigma^{(N)}, \sigma'^{(N)}\}$ and
$\{\sigma^{(T)}, \sigma'^{(N)}\}$. We do not consider genomes of
the form $\{\sigma^{(T)}, \sigma'^{(T)}\}$, since, if our
population initially consists of genomes which have never been
involved in daughter strand synthesis, then such genomes can
never appear in the population. The reason for this is that when
a parent strand serves as the template for daughter strand
synthesis, the daughter strand automatically receives the "N"
designation. Thus, two "T" strands can never be paired with one
another.



B. Random segregation 

For random chromosome segregation, each of the par- 
ent strands of a replicating genome has an equal proba- 
bility of becoming incorporated into the stem cell. The 
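random segregation equations are then given by

$$
\frac{dx_{\{\sigma,\sigma'\}}}{dt} = -\kappa_{\{\sigma,\sigma'\}} x_{\{\sigma,\sigma'\}}
+ \frac{1}{2} \sum_{\{\sigma'',\sigma'''\}} \kappa_{\{\sigma'',\sigma'''\}} x_{\{\sigma'',\sigma'''\}}
\left[ p((\sigma'',\sigma'''),\{\sigma,\sigma'\}) + p((\sigma''',\sigma''),\{\sigma,\sigma'\}) \right]
\tag{1}
$$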

The term $-\kappa_{\{\sigma,\sigma'\}} x_{\{\sigma,\sigma'\}}$ arises
from the observation that, in semiconservative replication, the
separation of the parent strands corresponds to the effective
destruction of the original genome. The second term gives the
rate at which $\{\sigma, \sigma'\}$ is produced, due to replication
and mutation, by all genomes in the population. The factor of
1/2 arises because for random chromosome segregation, both
parent strands $\sigma''$ and $\sigma'''$ of a replicating genome
$\{\sigma'', \sigma'''\}$ have an equal probability of being
retained by the stem cell.

The above equations are fairly cumbersome for direct analysis,
since the dynamics occurs over a space of double-stranded
genomes. If the strands are completely correlated, so that in a
genome $\{\sigma, \sigma'\}$ we always have
$\sigma' = \bar{\sigma}$, then following the derivation in Q, it
is possible to convert the dynamics over the space of
double-stranded genomes into an equivalent dynamics over the
space of single strands. This conversion is not possible when
the assumption of complementarity does not hold. Nevertheless,
following the derivation in [7], we can convert the dynamics
over the space of double-stranded genomes into an equivalent
dynamics over the space of ordered strand pairs. Specifically,
given some genome $\{\sigma, \sigma'\}$, define

$$
y_{(\sigma,\sigma')} = y_{(\sigma',\sigma)} =
\begin{cases}
\frac{1}{2} x_{\{\sigma,\sigma'\}} & \text{if } \sigma \neq \sigma' \\
x_{\{\sigma,\sigma\}} & \text{if } \sigma = \sigma'
\end{cases}
\tag{2}
$$

Furthermore, define an ordered strand pair fitness landscape via
$\kappa_{(\sigma,\sigma')} = \kappa_{(\sigma',\sigma)} = \kappa_{\{\sigma,\sigma'\}}$.
The random segregation equations then become



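$$
\frac{dy_{(\sigma,\sigma')}}{dt} = -\kappa_{(\sigma,\sigma')} y_{(\sigma,\sigma')}
+ \frac{1}{2} \sum_{(\sigma'',\sigma''')} \kappa_{(\sigma'',\sigma''')} y_{(\sigma'',\sigma''')}
\left[ p((\sigma'',\sigma'''),(\sigma,\sigma')) + p((\sigma'',\sigma'''),(\sigma',\sigma)) \right]
\tag{3}
$$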
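
As a minimal sketch of the change of variables used here, the Python snippet below converts population fractions over unordered genomes into fractions over ordered strand pairs: each ordering receives half of the genome's fraction when the two strands differ, and the whole fraction when they are identical. The data layout (strings for strands, a `frozenset` key per genome), the function name `ordered_pair_fractions`, and the example population are illustrative assumptions, not part of the model.

```python
def ordered_pair_fractions(x):
    """Map unordered-genome fractions to ordered-strand-pair fractions.

    x: dict keyed by a frozenset holding one strand (both strands identical)
       or two strands (strands differ); values are population fractions.
    Returns a dict keyed by ordered tuples (s, s').
    """
    y = {}
    for genome, frac in x.items():
        strands = sorted(genome)
        if len(strands) == 1:
            # Identical strands: a single ordered pair carries the full fraction.
            s = strands[0]
            y[(s, s)] = frac
        else:
            # Distinct strands: both orderings receive half of the fraction.
            s, sp = strands
            y[(s, sp)] = frac / 2.0
            y[(sp, s)] = frac / 2.0
    return y


# Hypothetical three-genome population (strand labels are placeholders).
x = {
    frozenset(["AT"]): 0.5,
    frozenset(["AC", "GT"]): 0.3,
    frozenset(["AG", "CT"]): 0.2,
}
y = ordered_pair_fractions(x)
for pair, frac in sorted(y.items()):
    print(pair, frac)
```

The ordered-pair fractions still sum to one, so the transformation only redistributes population weight; it does not change the total.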



C. Immortal strand co-segregation 



To derive the evolutionary dynamics for a stem 
cell population replicating with immortal strand co- 
segregation, we have to take into account the ages of 
the strands. In this case, we have to separately derive 
the dynamics for genomes where neither strand has been 
used as a template for daughter strand synthesis, and 
where one of the strands has been used as a template 
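for daughter strand synthesis. The resulting system of
equations is given by

$$
\begin{aligned}
\frac{dx_{\{\sigma^{(N)},\sigma'^{(N)}\}}}{dt} &= -\kappa_{\{\sigma,\sigma'\}} x_{\{\sigma^{(N)},\sigma'^{(N)}\}} \\
\frac{dx_{\{\sigma^{(T)},\sigma'^{(N)}\}}}{dt} &= -\kappa_{\{\sigma,\sigma'\}} x_{\{\sigma^{(T)},\sigma'^{(N)}\}} \\
&\quad + \frac{1}{2} \sum_{\{\sigma''^{(N)},\sigma'''^{(N)}\}} \kappa_{\{\sigma'',\sigma'''\}} x_{\{\sigma''^{(N)},\sigma'''^{(N)}\}}
\left[ p((\sigma'',\sigma'''),(\sigma,\sigma')) + p((\sigma''',\sigma''),(\sigma,\sigma')) \right] \\
&\quad + \sum_{\{\sigma''^{(T)},\sigma'''^{(N)}\}} \kappa_{\{\sigma'',\sigma'''\}} x_{\{\sigma''^{(T)},\sigma'''^{(N)}\}}
\, p((\sigma'',\sigma'''),(\sigma,\sigma'))
\end{aligned}
\tag{4}
$$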

Note that genomes of the form $\{\sigma^{(N)}, \sigma'^{(N)}\}$
cannot be produced via replication, since replication occurs via
a parent strand which has then been used as a template for
daughter strand synthesis at least once.

Note also that when a genome $\{\sigma''^{(N)}, \sigma'''^{(N)}\}$
replicates, strands $\sigma''$ and $\sigma'''$ have an equal
probability of being retained by the stem cell. Of course, when a
genome $\{\sigma''^{(T)}, \sigma'''^{(N)}\}$ replicates, then it is
strand $\sigma''$ that is retained by the stem cell.

Finally, note in the second equation that we are not considering
probabilities $p((\sigma'',\sigma'''),\{\sigma,\sigma'\})$, but
rather probabilities $p((\sigma'',\sigma'''),(\sigma,\sigma'))$. The
reason for this is that in considering the production of genome
$\{\sigma^{(T)}, \sigma'^{(N)}\}$, strand $\sigma$ is explicitly
marked as the template strand, while strand $\sigma'$ is
explicitly marked as the newly synthesized daughter strand.
Therefore, to form $\{\sigma^{(T)}, \sigma'^{(N)}\}$, it is clear
that the parent (template) strand $\sigma''$ must become $\sigma$,
with daughter strand $\sigma'$.
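
The retention rule described above can be sketched in a few lines of Python. This is an illustrative toy, assuming error-free daughter strand synthesis (no mismatches and no lesion repair); the function names `complement` and `segregate` and the tuple layout `(sequence, age)` are hypothetical choices made for this example only.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Antiparallel Watson-Crick complement of a strand."""
    return "".join(PAIR[b] for b in reversed(strand))


def segregate(genome, rng=random):
    """Return the genome kept by the stem cell after one division.

    genome: two (sequence, age) pairs, where age is "T" (has served as a
    template) or "N" (never a template). Mutation is ignored in this sketch.
    """
    (s1, a1), (s2, a2) = genome
    if a1 == "T" and a2 == "T":
        raise ValueError("two 'T' strands cannot be paired")
    if a1 == "T":
        kept = s1                    # the older strand is always retained
    elif a2 == "T":
        kept = s2
    else:
        kept = rng.choice([s1, s2])  # neither strand has an age advantage
    # The retained strand has now served as a template ("T"); the strand
    # synthesized against it is new ("N").
    return ((kept, "T"), (complement(kept), "N"))


print(segregate((("AC", "T"), ("GA", "N"))))
```

Whatever the input, the output genome always has exactly one "T" strand and one "N" strand, which is why genomes with two "T" strands never need to be tracked.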

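As with the random segregation equations, we may define an
equivalent dynamics over the space of ordered strand pairs. We do
this in two steps. First, define

$$
y_{(\sigma^{(N)},\sigma'^{(N)})} = y_{(\sigma'^{(N)},\sigma^{(N)})} =
\begin{cases}
\frac{1}{2} x_{\{\sigma^{(N)},\sigma'^{(N)}\}} & \text{if } \sigma \neq \sigma' \\
x_{\{\sigma^{(N)},\sigma^{(N)}\}} & \text{if } \sigma = \sigma'
\end{cases}
\tag{5}
$$

and

$$
y_{(\sigma^{(T)},\sigma'^{(N)})} = x_{\{\sigma^{(T)},\sigma'^{(N)}\}}
\tag{6}
$$

The ordered strand pair fitness landscape is defined as for
random segregation. The result is the transformed system of
equations

$$
\begin{aligned}
\frac{dy_{(\sigma^{(N)},\sigma'^{(N)})}}{dt} &= -\kappa_{(\sigma,\sigma')} y_{(\sigma^{(N)},\sigma'^{(N)})} \\
\frac{dy_{(\sigma^{(T)},\sigma'^{(N)})}}{dt} &= -\kappa_{(\sigma,\sigma')} y_{(\sigma^{(T)},\sigma'^{(N)})} \\
&\quad + \sum_{(\sigma''^{(N)},\sigma'''^{(N)})} \kappa_{(\sigma'',\sigma''')} y_{(\sigma''^{(N)},\sigma'''^{(N)})}
\, p((\sigma'',\sigma'''),(\sigma,\sigma')) \\
&\quad + \sum_{(\sigma''^{(T)},\sigma'''^{(N)})} \kappa_{(\sigma'',\sigma''')} y_{(\sigma''^{(T)},\sigma'''^{(N)})}
\, p((\sigma'',\sigma'''),(\sigma,\sigma'))
\end{aligned}
\tag{7}
$$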



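The key equality to note in deriving the transformed dynamics is

$$
\begin{aligned}
& \sum_{\{\sigma''^{(N)},\sigma'''^{(N)}\}} \kappa_{\{\sigma'',\sigma'''\}} x_{\{\sigma''^{(N)},\sigma'''^{(N)}\}}
\left[ p((\sigma'',\sigma'''),(\sigma,\sigma')) + p((\sigma''',\sigma''),(\sigma,\sigma')) \right] \\
&= 2 \sum_{\{\sigma''^{(N)},\sigma'''^{(N)}\},\, \sigma'' \neq \sigma'''}
\left[ \kappa_{(\sigma'',\sigma''')} y_{(\sigma''^{(N)},\sigma'''^{(N)})} \, p((\sigma'',\sigma'''),(\sigma,\sigma'))
+ \kappa_{(\sigma''',\sigma'')} y_{(\sigma'''^{(N)},\sigma''^{(N)})} \, p((\sigma''',\sigma''),(\sigma,\sigma')) \right] \\
&\quad + 2 \sum_{\{\sigma''^{(N)},\sigma''^{(N)}\}} \kappa_{(\sigma'',\sigma'')} y_{(\sigma''^{(N)},\sigma''^{(N)})}
\, p((\sigma'',\sigma''),(\sigma,\sigma')) \\
&= 2 \sum_{(\sigma''^{(N)},\sigma'''^{(N)})} \kappa_{(\sigma'',\sigma''')} y_{(\sigma''^{(N)},\sigma'''^{(N)})}
\, p((\sigma'',\sigma'''),(\sigma,\sigma'))
\end{aligned}
\tag{8}
$$

Finally, if we define
$y_{(\sigma,\sigma')} = y_{(\sigma^{(N)},\sigma'^{(N)})} + y_{(\sigma^{(T)},\sigma'^{(N)})}$,
then we obtain

$$
\frac{dy_{(\sigma,\sigma')}}{dt} = -\kappa_{(\sigma,\sigma')} y_{(\sigma,\sigma')}
+ \sum_{(\sigma'',\sigma''')} \kappa_{(\sigma'',\sigma''')} y_{(\sigma'',\sigma''')}
\, p((\sigma'',\sigma'''),(\sigma,\sigma'))
\tag{9}
$$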

Note that the ordered strand pair population fractions are
defined somewhat differently for immortal and random chromosome
segregation. For random chromosome segregation, the age of the
strands is irrelevant to the division kinetics. Given a genome
$\{\sigma, \sigma'\}$, there is no canonical ordering of the
strands $\sigma$ and $\sigma'$. If $\sigma \neq \sigma'$, then the
ordered pairs $(\sigma, \sigma')$ and $(\sigma', \sigma)$ should
receive identical contributions from the genome
$\{\sigma, \sigma'\}$.

For immortal strand co-segregation, the above argument holds for
genomes of the form $\{\sigma^{(N)}, \sigma'^{(N)}\}$. However, for
genomes of the form $\{\sigma^{(T)}, \sigma'^{(N)}\}$, a canonical
ordering of the strands exists. Namely, we place the older strand
before the younger in the ordered strand pair representation.
This means that, for immortal strand co-segregation, we may
regard $y_{(\sigma,\sigma')}$ to be the total fraction of stem
cells with template strand $\sigma$ and daughter strand
$\sigma'$. The only potential problem with this interpretation is
the inclusion of $y_{(\sigma^{(N)},\sigma'^{(N)})}$ as part of this
population fraction. However, this may be resolved by noting that
while $\{\sigma^{(N)}, \sigma'^{(N)}\}$ has not yet undergone a
replication cycle, when it does, either $\sigma^{(N)}$ or
$\sigma'^{(N)}$ will be segregated into the original stem cell.
Therefore, we may effectively preassign a "T" designation to
either $\sigma$ or $\sigma'$. If $\sigma = \sigma'$, then $\sigma$
is the preassigned template strand for all genomes, while if
$\sigma \neq \sigma'$, then $\sigma$ is the preassigned template
strand for half of the genomes. This interpretation for
$y_{(\sigma,\sigma')}$ is consistent with the definition for
$y_{(\sigma,\sigma')}$
($\frac{1}{2} x_{\{\sigma^{(N)},\sigma'^{(N)}\}} + x_{\{\sigma^{(T)},\sigma'^{(N)}\}}$
for $\sigma \neq \sigma'$, and
$x_{\{\sigma^{(N)},\sigma^{(N)}\}} + x_{\{\sigma^{(T)},\sigma^{(N)}\}}$
if $\sigma = \sigma'$).

In contrast to random chromosome segregation, for immortal strand
co-segregation it is not generally true that
$y_{(\sigma',\sigma)} = y_{(\sigma,\sigma')}$. The reason for this
is that in the case of $(\sigma, \sigma')$, $\sigma$ is the
template strand which has been present through all stem cell
divisions (though perhaps mutated to something different from the
original strand). In the case of $(\sigma', \sigma)$, it is
$\sigma'$ that has remained in the stem cell. If $\sigma$ and
$\sigma'$ are different, there is no reason to expect an identical
evolutionary pathway for the two strands, hence it is incorrect
to assume that $y_{(\sigma,\sigma')} = y_{(\sigma',\sigma)}$.



D. Equivalence of random and immortal strand 
co-segregation when lesion repair is perfectly 
efficient 



Under very general conditions, it is possible to show that when
lesion repair is perfect, then random and immortal strand
co-segregation yield identical stem cell dynamics. We need only
make the following assumptions: (1) For any ordered strand pair
$(\sigma, \sigma')$, we have
$\kappa_{(\bar{\sigma},\bar{\sigma}')} = \kappa_{(\sigma,\sigma')}$.
(2) For any two ordered strand pairs $(\sigma, \sigma')$ and
$(\sigma'', \sigma''')$, we have
$p((\sigma'',\sigma'''),(\sigma,\sigma')) = p((\bar{\sigma}'',\bar{\sigma}'''),(\bar{\sigma},\bar{\sigma}'))$.
(3) For any ordered strand pair $(\sigma, \sigma')$, we have
$y_{(\bar{\sigma},\bar{\sigma}')} = y_{(\sigma,\sigma')}$.

Because taking the complement of a strand essentially 
amounts to a relabelling of the bases and a change in 
the direction in which the strand is read, there is no 
reason to assume that conditions (1) - (3) should not 
hold in general. Indeed, cases where properties (1) - (3) 
do not hold indicate a strand asymmetry, a condition 
which results from specific, and presumably non-generic, 
base orderings. 

If we assume that the fitness and "mutation" land- 
scapes are chosen so that properties (1) and (2) are met, 
then if our population initially satisfies property (3) (ob- 
tained with a lesion-free population, for example), it is 
possible to show that property (3) holds for all time. The 
proof of this is similar to the proof of the analogous statement
for quasispecies dynamics with imperfect lesion repair
[7], and will therefore be omitted here.

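When lesion repair is perfect, then an initially lesion-free
population remains lesion free. In this case we have, for
random segregation,

$$
\begin{aligned}
\frac{dy_{(\sigma,\bar{\sigma})}}{dt} &= -\kappa_{(\sigma,\bar{\sigma})} y_{(\sigma,\bar{\sigma})}
+ \frac{1}{2} \sum_{(\sigma',\bar{\sigma}')} \kappa_{(\sigma',\bar{\sigma}')} y_{(\sigma',\bar{\sigma}')}
\, p((\sigma',\bar{\sigma}'),(\sigma,\bar{\sigma})) \\
&\quad + \frac{1}{2} \sum_{(\sigma',\bar{\sigma}')} \kappa_{(\sigma',\bar{\sigma}')} y_{(\sigma',\bar{\sigma}')}
\, p((\sigma',\bar{\sigma}'),(\bar{\sigma},\sigma)) \\
&= -\kappa_{(\sigma,\bar{\sigma})} y_{(\sigma,\bar{\sigma})}
+ \sum_{(\sigma',\bar{\sigma}')} \kappa_{(\sigma',\bar{\sigma}')} y_{(\sigma',\bar{\sigma}')}
\, p((\sigma',\bar{\sigma}'),(\sigma,\bar{\sigma}))
\end{aligned}
\tag{10}
$$

where the two sums are equal by properties (1) - (3), and
which coincides with the immortal strand equations.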

III. THE "MASTER-GENOME" FITNESS 
LANDSCAPE 

A. Infinite sequence length equations

Following the derivation of the quasispecies equations with
imperfect lesion repair [7], we will now develop the infinite
sequence length equations for a class of fitness landscapes
defined by a "master" genome $\{\sigma_0, \bar{\sigma}_0\}$. For
simplicity, we assume that $\epsilon_{\{\sigma,\sigma'\}}$ and
$\lambda_{\{\sigma,\sigma'\}}$ are genome independent, and may
respectively be denoted by $\epsilon$ and $\lambda$.

Following the convention used with quasispecies dynamics, we
derive the infinite sequence length equations with
$\mu = L\epsilon$ held constant. This is equivalent to fixing the
genome replication fidelity, given by $e^{-\mu}$, in the limit of
infinite sequence length.
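
To illustrate why holding $\mu = L\epsilon$ fixed gives a well-defined limit, the sketch below evaluates the probability that a replication leaves no trace at any site, under the assumption of independent sites and the repair rule of Section II A (a mismatch is made with probability $\epsilon$, is repaired with probability $\lambda$, and a repaired mismatch restores the parent base half of the time). Under these assumptions the per-site probability is $1 - \epsilon(1 - \lambda/2)$, and the genome-wide probability tends to $e^{-\mu(1-\lambda/2)}$; setting $\lambda = 0$ recovers the replication fidelity $e^{-\mu}$. The function name and parameter values are illustrative.

```python
import math


def clean_replication_probability(L, mu, lam):
    """Probability that no site ends up as a lesion or a fixed mutation.

    L:   sequence length
    mu:  expected number of mismatches per replication (mu = L * eps)
    lam: lesion repair probability
    """
    eps = mu / L
    # Per site: either no mismatch is made (1 - eps), or a mismatch is made,
    # repaired, and the repair happens to restore the parent base (eps*lam/2).
    return (1.0 - eps * (1.0 - lam / 2.0)) ** L


mu, lam = 0.1, 0.5
limit = math.exp(-mu * (1.0 - lam / 2.0))
for L in (20, 200, 2000, 20000):
    print(L, clean_replication_probability(L, mu, lam), limit)
```

Even at a sequence length of 20 the finite-length value is already close to the limiting exponential for these parameters, which is consistent with using the infinite sequence length equations as an approximation.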

The derivation of the infinite sequence length equations from the
finite sequence length equations for stem cell division parallels
the derivation of the infinite sequence length equations for
semiconservative replication with imperfect lesion repair. We
therefore refer the reader to [7] for details. In this paper, we
only provide the necessary definitions for understanding the
final form of the infinite sequence length equations.

To begin, we note that the "master" genome
$\{\sigma_0, \bar{\sigma}_0\}$ gives rise to the ordered sequence
pairs $(\sigma_0, \bar{\sigma}_0)$ and $(\bar{\sigma}_0, \sigma_0)$.
In the limit of infinite sequence length, the two master strands
$\sigma_0$ and $\bar{\sigma}_0$ become infinitely separated from
each other in Hamming distance, hence we may regard
$(\sigma_0, \bar{\sigma}_0)$ and $(\bar{\sigma}_0, \sigma_0)$ as
infinitely separated from each other in the ordered sequence pair
space.

We may therefore group all sequence pairs $(\sigma, \sigma')$ into
one of three classes: A sequence pair $(\sigma, \sigma')$ is said
to be of the first class if $D_H(\sigma, \sigma_0)$ and
$D_H(\sigma', \bar{\sigma}_0)$ are both finite, where $D_H$ denotes
the Hamming distance. A sequence pair $(\sigma, \sigma')$ is said
to be of the second class if $D_H(\sigma, \bar{\sigma}_0)$ and
$D_H(\sigma', \sigma_0)$ are both finite. Finally, a sequence pair
not belonging to either one of the first two classes is said to
belong to the third class.

A given sequence pair $(\sigma, \sigma')$ of the first class can be
characterized by the four parameters, denoted $l_C$, $l_L$,
$l_R$, and $l_B$. The first parameter, $l_C$, denotes the number
of positions where $\sigma$ and $\sigma'$ are complementary, yet
differ from the corresponding positions in $\sigma_0$ and
$\bar{\sigma}_0$, respectively. The second parameter, $l_L$,
denotes the number of positions where $\sigma$ differs from
$\sigma_0$, but the complementary positions in $\sigma'$ are equal
to the corresponding ones in $\bar{\sigma}_0$. The third
parameter, $l_R$, denotes the number of positions where $\sigma$
is equal to the ones in $\sigma_0$, but the complementary
positions in $\sigma'$ differ from the corresponding ones in
$\bar{\sigma}_0$. Finally, the fourth parameter, $l_B$, denotes the
number of positions where $\sigma$ and $\sigma'$ are not
complementary, and also differ from the corresponding positions
in $\sigma_0$ and $\bar{\sigma}_0$. These definitions are
illustrated in Figure 3 of [7]. A sequence pair of the second
class may be similarly characterized (except $\sigma_0$ and
$\bar{\sigma}_0$ are swapped in the definitions given above).

We assume that the fitness of a given sequence pair of the first
class is determined by $l_C$, $l_L$, $l_R$, and $l_B$, hence we
may write that
$\kappa_{(\sigma,\sigma')} = \kappa_{(l_C,l_L,l_R,l_B)}$. The fitness
of a sequence pair $(\sigma, \sigma')$ of the second class is
determined by noting that $(\sigma', \sigma)$ is of the first
class, and that $\kappa_{(\sigma,\sigma')} = \kappa_{(\sigma',\sigma)}$.
We take the third class sequence pairs to be unviable, with a
first-order growth rate of 1.

We also assume that
$\kappa_{(l_C,l_L,l_R,l_B)} = \kappa_{(l_C,l_R,l_L,l_B)}$. This is a
natural assumption to make if one assumes symmetry between the
two master strands. In [7], we show that this assumption implies
that $\kappa_{(\bar{\sigma},\bar{\sigma}')} = \kappa_{(\sigma,\sigma')}$.

We allow our system to come to equilibrium starting from the
initial condition
$y_{(\sigma_0,\bar{\sigma}_0)} = y_{(\bar{\sigma}_0,\sigma_0)} = 1/2$.
This initial condition corresponds to an initially mutation-free
stem cell population.

We may sum over the population fractions of all first class
sequence pairs characterized by a given set of $l_C$, $l_L$,
$l_R$, and $l_B$, and reexpress the quasispecies dynamics in
terms of these quantities. We define $z_{(l_C,l_L,l_R,l_B)}$ to be
the total population fraction of first class sequence pairs
characterized by $l_C$, $l_L$, $l_R$, and $l_B$. We similarly
define $\bar{z}_{(l_C,l_L,l_R,l_B)}$ to be the total population
fraction of second class sequence pairs characterized by $l_C$,
$l_L$, $l_R$, and $l_B$. Following the derivation in [7], we then
obtain
dz,



il c,lL,lR,0) 

dt 



XI X X Hl'l,lc-l'c-li>l'ifi)Z{VI,lc-l'c-l'{,l':i,0) 



for random segregation, and 



dz 



dt 



= -li{lcfi,lR,0)Z{lc,0,lR,0) 



1 1 \ °° 



(11) 



0)^(ic-'c'0''2'.0) 



(12) 



for immortal strand co-segregation. An analogous set of equations may be derived for the $\bar{z}_{(l_C,l_L,l_R,l_B)}$. Using the fact
that $y_{(\bar{\sigma},\bar{\sigma}')} = y_{(\sigma,\sigma')}$, we have $\bar{z}_{(l_C,l_L,l_R,l_B)} = z_{(l_C,l_L,l_R,l_B)}$.



An interesting feature to note from comparison of these two
equations is that for random chromosome segregation, it is
possible for $l_L > 0$, while for immortal strand
co-segregation, we have $l_L = 0$. In the case of random
segregation, the ordered strand pairs $(\sigma, \sigma')$ and
$(\sigma', \sigma)$ are equivalent, hence we have
$z_{(l_C,l_L,l_R,l_B)} = \bar{z}_{(l_C,l_R,l_L,l_B)}$, which implies
that $z_{(l_C,l_L,l_R,l_B)} = z_{(l_C,l_R,l_L,l_B)}$. In the case of
immortal strand co-segregation, the first strand of the ordered
strand pair represents the parent strand. Because the parent
strand differs from $\sigma_0$ (or $\bar{\sigma}_0$ when looking at
the $\bar{z}$ equations) in only a finite number of positions, in
the limit of infinite sequence length the probability that a
mismatch occurs where the parent strand differs from $\sigma_0$
is 0. Therefore, any lesions that occur will be due to an error
made in the daughter strand, where the corresponding bases of the
parent strand are identical to those of $\sigma_0$. Thus, $l_L$
remains 0, but $l_R$ can become positive.

Finally, from these equations it is possible to show that 
a population of adult stem cells will eventually degrade 
unless lesion repair is turned off and chromosome segre- 
gation occurs via the immortal strand mechanism. For 
random chromosome segregation, a given stem cell will 
periodically retain an erroneous daughter strand, result- 
ing in a steady degradation of the genome. For immor- 
tal strand co-segregation with nonzero lesion repair effi- 
ciency, mistakes in the daughter strands will periodically 
be communicated to the parent strand via lesion repair. 
The result is again a steady degradation of the genome. 



B. Decay of the master-genome population 

We may derive a set of differential equations describing the
decay of the master genome population. We consider a fitness
landscape where the viable genomes have a first-order growth rate
constant $k_+$, and the unviable genomes have a first-order
growth rate constant $k_- < k_+$. An ordered strand pair is taken
to be viable if $l_C \leq l_{C,max}$, and if
$l_L + l_R + l_B \leq l$. Thus, an ordered strand pair is viable
if it has no more than $l_{C,max}$ fixed mutations, and no more
than $l$ lesions. Otherwise, the strand pair is unviable.

Defining $z_0 = z_{(0,0,0,0)}$,
$z_1 = \sum_{l'=0}^{l} z_{(0,0,l',0)}$, and
$z_2 = \sum_{l'=0}^{\infty} z_{(0,0,l',0)}$, we obtain, for random
segregation, that

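$$
\begin{aligned}
\frac{dz_0}{dt} &= -k_+ z_0 + e^{-\mu(1-\lambda/2)} \left[ (k_+ - k_-) z_1 + k_- z_2 \right] \\
\frac{dz_1}{dt} &= -k_+ z_1 + \frac{1}{2} \left( 1 + f_l(\mu,\lambda) \right) e^{-\mu(1-\lambda/2)}
\left[ (k_+ - k_-) z_1 + k_- z_2 \right] \\
\frac{dz_2}{dt} &= -\left( 1 - \frac{1}{2} \left( e^{-\mu(1-\lambda/2)} + e^{-\mu\lambda/2} \right) \right)
\left[ (k_+ - k_-) z_1 + k_- z_2 \right]
\end{aligned}
\tag{13}
$$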

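For immortal strand co-segregation, we obtain

$$
\begin{aligned}
\frac{dz_0}{dt} &= -k_+ z_0 + e^{-\mu(1-\lambda/2)} \left[ (k_+ - k_-) z_1 + k_- z_2 \right] \\
\frac{dz_1}{dt} &= -k_+ z_1 + f_l(\mu,\lambda) \, e^{-\mu(1-\lambda/2)} \left[ (k_+ - k_-) z_1 + k_- z_2 \right] \\
\frac{dz_2}{dt} &= -\left( 1 - e^{-\mu\lambda/2} \right) \left[ (k_+ - k_-) z_1 + k_- z_2 \right]
\end{aligned}
\tag{14}
$$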



We may solve Eqs. (13) and (14) using standard numerical
methods, for the initial condition $z_0 = z_1 = z_2 = 1/2$. This
corresponds to an initial stem cell population consisting
entirely of the master genome genotype.
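
As a rough sketch of such a numerical solution, the Python code below integrates the three-variable systems for random and immortal strand segregation with a classical fourth-order Runge-Kutta scheme. Several things here are assumptions rather than statements of the original method: the right-hand sides follow one reading of Eqs. (13) and (14), in which the $z_0$ equation is the $l = 0$ case of the $z_1$ equation; $f_l(\mu,\lambda)$ is taken to be the truncated series $\sum_{n=0}^{l} [\mu(1-\lambda)]^n / n!$, chosen only because it equals 1 at $l = 0$ and tends to $e^{\mu(1-\lambda)}$ as $l \to \infty$; and the integrator, function names, and defaults (set to the parameter values quoted for Figure 3) are illustrative.

```python
import math


def f_l(l, mu, lam):
    """Assumed form of f_l: exponential series in mu*(1-lam), truncated at order l."""
    a = mu * (1.0 - lam)
    return sum(a ** n / math.factorial(n) for n in range(l + 1))


def rhs(z, mode, kp, km, mu, lam, l):
    """Right-hand side of the (z0, z1, z2) system for the given segregation mode."""
    z0, z1, z2 = z
    flux = (kp - km) * z1 + km * z2              # replication rate of tracked cells
    clean = math.exp(-mu * (1.0 - lam / 2.0))    # no lesion left, nothing fixed
    keep = math.exp(-mu * lam / 2.0)             # template strand left unmutated
    fl = f_l(l, mu, lam)
    if mode == "random":
        return (-kp * z0 + clean * flux,
                -kp * z1 + 0.5 * (1.0 + fl) * clean * flux,
                -(1.0 - 0.5 * (clean + keep)) * flux)
    # immortal strand co-segregation
    return (-kp * z0 + clean * flux,
            -kp * z1 + fl * clean * flux,
            -(1.0 - keep) * flux)


def integrate(mode, t_end=10.0, dt=0.001, kp=10.0, km=1.0, mu=0.1, lam=0.5, l=1):
    """Integrate from z0 = z1 = z2 = 1/2 to t_end with classical RK4 steps."""
    z = (0.5, 0.5, 0.5)
    args = (mode, kp, km, mu, lam, l)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(z, *args)
        k2 = rhs(tuple(zi + 0.5 * dt * ki for zi, ki in zip(z, k1)), *args)
        k3 = rhs(tuple(zi + 0.5 * dt * ki for zi, ki in zip(z, k2)), *args)
        k4 = rhs(tuple(zi + dt * ki for zi, ki in zip(z, k3)), *args)
        z = tuple(zi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for zi, a, b, c, d in zip(z, k1, k2, k3, k4))
    return z


z_random = integrate("random")
z_immortal = integrate("immortal")
print("random   (z0, z1, z2):", z_random)
print("immortal (z0, z1, z2):", z_immortal)
```

With $\mu = 0$ both systems leave the initial condition unchanged, which is a quick sanity check on any implementation of these equations.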

In Figure 3 we show a comparison of the numerical so- 
lution of Eqs. (13) and (14) with the results of stochas- 
tic simulations of dividing stem cells. The lesion repair 
probability $\lambda$ is taken to be 0.5 in this case.

[Figure 3 plot. Legend: random segregation, theory; random segregation, simulation; immortal strand co-segregation, theory; immortal strand co-segregation, simulation. Horizontal axis: time (dimensionless).]

FIG. 3: Comparison of theory and simulation for a population
of 10,000 stem cells with genomes of sequence length 20. We
assume $k_+ = 10$, $k_- = 1$, $\mu = 0.1$, and $l = 1$. We iterated
in time steps of length 0.001 out to a time of 10.



IV. OPTIMAL LESION REPAIR 
PROBABILITIES 

We can use Eq. (14) to determine the optimal lesion repair
probability for preserving the stem cell line out to a given
time $T$. We use $z_0$ as our measure for the extent of the
preservation of the stem cell line. The higher the value of
$z_0$, the better the stem cell line is preserved. For
simplicity, we also take $k_- = 0$, i.e., we assume that unviable
stem cells do not replicate at all. We also rescale the time by
defining $\tau = k_+ t$. We then obtain

$$
z_0(\tau) = \frac{1}{2} e^{-\tau} \left[ 1 + \frac{1}{f_l(\mu,\lambda)}
\left( e^{f_l(\mu,\lambda) \exp(-\mu(1-\lambda/2)) \tau} - 1 \right) \right]
\tag{15}
$$

Therefore, maximizing $z_0(T)$ is equivalent to maximizing
$g_l(\lambda; \mu, T) = (e^{f_l(\mu,\lambda) \exp(-\mu(1-\lambda/2)) T} - 1)/f_l(\mu,\lambda)$.

It is instructive to consider the behavior of $g_l$ for $l = 0$
and $l = \infty$. For $l = 0$, we have
$g_0 = e^{\exp(-\mu(1-\lambda/2)) T} - 1$, which is clearly
maximized for any $\mu$ and $T$ when $\lambda = 1$. This makes
sense because, when $l = 0$, then any lesion renders the stem
cell unviable. Preserving the information in the parent strand by
reducing the lesion repair efficiency does not help maintain the
population of master genomes, since an unviable stem cell does
not replicate. Therefore, in this case, it is optimal to make
lesion repair maximally efficient, thereby reducing the overall
mutation rate away from the master genome.

For imperfect lesion repair to allow for better
preservation of the stem cell population within our model,
we must therefore assume that $l > 0$. While typical values
of $l$ for cellular organisms are not available (the matter
is also complicated by additional repair mechanisms such
as the SOS response), we may note that the smaller the value
of $\mu$, the fewer errors are made during replication (an
average of $\mu$ are made). Thus, in practice, for small
$\mu$, one may assume that $l = \infty$, since a large number
of lesions will not be produced in any case (mathematically,
this is equivalent to the observation that the series
$\{f_l(\mu, \lambda)\}$ converges to
$f_\infty(\mu, \lambda) = e^{\mu(1 - \lambda)}$ more quickly
at smaller values of $\mu$ than at larger values of $\mu$).
Since cells have various error correction mechanisms which
keep the overall number of replication errors on the order
of 1 or less per replication cycle, the assumption that
$l = \infty$ seems to be a reasonable one, and will be used
here.
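The convergence claim can be illustrated numerically. The sketch below assumes that $f_l$ is the $l$-th partial sum of the exponential series in $\mu(1 - \lambda)$. This form is an assumption made here for illustration: it is consistent with the two limits quoted in the text ($f_0 = 1$ and $f_\infty = e^{\mu(1 - \lambda)}$), but the definition of $f_l$ is not restated in this section. The function names are hypothetical.

```python
import math

def f_partial(l, mu, lam):
    """ASSUMED form of f_l: sum_{k=0}^{l} x^k / k!, with x = mu * (1 - lam)."""
    x = mu * (1.0 - lam)
    term, total = 1.0, 1.0
    for k in range(1, l + 1):
        term *= x / k          # x^k / k! built up iteratively
        total += term
    return total

def f_infinity(mu, lam):
    """Limit quoted in the text: f_infinity = exp(mu * (1 - lam))."""
    return math.exp(mu * (1.0 - lam))

def relative_gap(l, mu, lam):
    """Relative distance of the l-th partial sum from its limit."""
    return 1.0 - f_partial(l, mu, lam) / f_infinity(mu, lam)

if __name__ == "__main__":
    for mu in (0.1, 3.0):
        gaps = [relative_gap(l, mu, 0.5) for l in (0, 1, 2, 5, 10)]
        print(mu, ["%.2e" % g for g in gaps])
```

Under this assumed form, the gap at any fixed $l$ is smaller for $\mu = 0.1$ than for $\mu = 3$, which is the sense in which the $l = \infty$ approximation improves at small $\mu$.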



For $l = \infty$, we then have
$g_\infty = e^{-\mu(1 - \lambda)} (e^{\exp(-\mu \lambda/2) T} - 1)$.
For a given $\mu$ and $T$, we define $y = e^{-\mu \lambda/2} T$,
giving $g_\infty = e^{-\mu} T^2 (e^y - 1)/y^2$. The function
$(e^y - 1)/y^2$ goes to $\infty$ at $y = 0$ and $y = \infty$.
It has a unique point where its derivative vanishes,
corresponding to a global minimum. Thus, on any given
interval, the maximum value of $(e^y - 1)/y^2$ occurs at one
of the endpoints. In particular, this implies that $g_\infty$
is maximized for a given $\mu$ and $T$ at either
$\lambda = 0$ or $\lambda = 1$.
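The uniqueness of the stationary point can be checked directly. The short derivation below is a sketch added for illustration; the auxiliary symbols $h$, $\phi$, and $y^*$ are introduced here and do not appear in the original.

```latex
\begin{aligned}
h(y) &= \frac{e^{y} - 1}{y^{2}}, \qquad
h'(y) = \frac{y e^{y} - 2 (e^{y} - 1)}{y^{3}} = \frac{\phi(y)}{y^{3}}, \qquad
\phi(y) \equiv (y - 2) e^{y} + 2, \\
\phi(0) &= 0, \qquad \phi'(y) = (y - 1) e^{y}, \qquad \phi(2) = 2 > 0 .
\end{aligned}
```

Since $\phi$ decreases on $(0, 1)$ and increases on $(1, \infty)$, it starts at zero, becomes negative, and crosses zero exactly once, at some $y^* \in (1, 2)$ (numerically $y^* \approx 1.59$). Hence $h' < 0$ for $0 < y < y^*$ and $h' > 0$ for $y > y^*$, so $y^*$ is the global minimum of $h$ on $(0, \infty)$ and lies below $y = 2$.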

To determine whether the optimal $\lambda$ is 0 or 1 for
given values of $\mu$ and $T$, we note that $\lambda = 0$
corresponds to $y = T$, while $\lambda = 1$ corresponds to
$y = e^{-\mu/2} T$. The minimum value of $(e^y - 1)/y^2$
occurs before $y = 2$; hence, once $e^{-\mu/2} T > 2$,
$(e^y - 1)/y^2$ becomes monotone increasing on
$[e^{-\mu/2} T, T]$, so that $g_\infty$ is maximized for
$\lambda = 0$. For human cells, the genome length is of the
order of $3 \times 10^9$ base pairs, giving $\mu = 3$.
Therefore, if $T > 2 e^{\mu/2} \approx 9$, then optimal
preservation of the stem cell line occurs for $\lambda = 0$.
Current estimates place the number of adult stem cell
divisions in the human colon over a human lifetime at
around 5,000 [8]. In our rescaled time coordinates, this
gives $T = 5{,}000 \gg 9$. Clearly then, to optimally
preserve the stem cell line, our model indicates that lesion
repair should be turned off during cell division.
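The endpoint comparison can be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates $\log g_\infty$ (the logarithm avoids overflow at $T = 5{,}000$) at $\lambda = 0$ and $\lambda = 1$, and scans a grid of $\lambda$ values to confirm that the maximum sits at an endpoint. All function names are hypothetical, and $T > 0$ is assumed.

```python
import math

def log_g_inf(lam, mu, T):
    """log of g_inf = exp(-mu*(1 - lam)) * (exp(exp(-mu*lam/2) * T) - 1), for T > 0."""
    y = math.exp(-mu * lam / 2.0) * T
    if y < 30.0:
        log_expm1 = math.log(math.expm1(y))
    else:
        # log(e^y - 1) = y + log(1 - e^{-y}); avoids overflow for large y
        log_expm1 = y + math.log1p(-math.exp(-y))
    return -mu * (1.0 - lam) + log_expm1

def optimal_endpoint(mu, T):
    """Return whichever of lam = 0 or lam = 1 gives the larger g_inf."""
    return 0.0 if log_g_inf(0.0, mu, T) >= log_g_inf(1.0, mu, T) else 1.0

def best_on_grid(mu, T, n=101):
    """Grid search over lam in [0, 1]; should agree with optimal_endpoint."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda lam: log_g_inf(lam, mu, T))

def sufficient_threshold(mu):
    """The text's sufficient condition for lam = 0: T > 2 * exp(mu / 2)."""
    return 2.0 * math.exp(mu / 2.0)

if __name__ == "__main__":
    mu = 3.0
    print("threshold:", sufficient_threshold(mu))
    for T in (1.0, 5000.0):
        print(T, optimal_endpoint(mu, T), best_on_grid(mu, T))
```

With $\mu = 3$, the sketch selects $\lambda = 1$ at $T = 1$ and $\lambda = 0$ at $T = 5{,}000$. Note that $T > 2 e^{\mu/2}$ is a sufficient condition for $\lambda = 0$ to be optimal, not the exact crossover time.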

We should note that, at short times, it is optimal to
keep $\lambda = 1$, independent of $l$ (this can be shown by
expanding $g_l$ out to first order in $T$, and optimizing).
Also, for finite values of $l$, it is possible to show that,
at sufficiently long times, the optimal lesion repair
efficiency can be made arbitrarily close to 1 by making the
mutation rate $\mu$ arbitrarily large. This makes sense,
because, at high mutation rates, it is necessary to prevent
the formation of more than $l$ lesions during replication,
which renders the adult stem cell unviable.
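The short-time statement follows from a one-line expansion, sketched here for illustration. The shorthand $a(\lambda)$ is introduced for this sketch only, and $g_l$ is taken in the form $(e^{f_l a T} - 1)/f_l$ given earlier in this section.

```latex
\begin{aligned}
a(\lambda) &\equiv e^{-\mu (1 - \lambda/2)}, \\
g_l(\lambda; \mu, T)
  &= \frac{e^{f_l(\mu, \lambda)\, a(\lambda)\, T} - 1}{f_l(\mu, \lambda)}
   = a(\lambda)\, T + \tfrac{1}{2} f_l(\mu, \lambda)\, a(\lambda)^{2}\, T^{2} + O(T^{3}) .
\end{aligned}
```

To first order in $T$, the factor $f_l$ cancels, leaving $g_l \approx e^{-\mu(1 - \lambda/2)} T$. This is strictly increasing in $\lambda$ for $\mu > 0$, so it is maximized at $\lambda = 1$ for every $l$.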

For our purposes, however, the $l = \infty$ simplification
seems appropriate, since it is reasonable to assume that
$\mu = 3$ is considerably less than the number of mismatches
which a human adult stem cell can tolerate before becoming
unviable.

It is important to note that, by lesion repair, we
specifically refer to the repair of mismatched base pairs
along the DNA chain. The underlying assumption, however, is
that each base is one of the four standard bases (A, T, G,
C). Thus, when considering lesions in this model, we are not
considering lesions caused by chemical modifications of
bases, due to, for example, radiation or oxidative damage.
In principle, these lesions can be correctly repaired,
assuming that the damage is localized to only one of the
strands, because the chemical changes to the bases allow the
cellular repair mechanisms to determine on which strand the
lesion is present.

Thus, in determining that for human stem cells, lesion 
repair should be turned off during cell division, we mean 
that mismatches along the DNA genome should be left 
alone, so as not to risk fixing a mutation in both strands. 

While it is possible that distinct cellular mechanisms
exist for repairing postreplication mismatches and lesions
due to DNA damage, it is also possible that both types
of modifications to a DNA genome are handled by the
same repair pathways (Nucleotide Excision Repair, for
instance). Thus, it is possible that the way by which
adult stem cells suppress correction of mismatches along
the DNA chain is by a general suppression of lesion
repair. In this case, adult stem cells should be more
susceptible to the effects of agents which can damage
DNA. This increased susceptibility to DNA damage has
been hypothesized by Cairns [1], and does indeed appear
to be a property of adult stem cells [9].

V. CONCLUSIONS 

This paper developed a set of ordinary differential 
equations describing the evolutionary dynamics of a pop- 
ulation of adult stem cells. For simplicity, we considered 
stem cell genomes consisting of a single double-stranded 
DNA molecule, i.e., one chromosome. 

We considered two possible mechanisms of chromo- 
some segregation. In the first case, we assumed that 
chromosomes randomly segregate into the adult stem cell 
and undifferentiated tissue cell. In the second case, we 
assumed that the stem cell retains the chromosome con- 
taining the oldest DNA strand of the genome. This co- 
segregation mechanism, termed the immortal strand hy- 
pothesis, was originally proposed by Cairns in 1975 as 
a mechanism by which stem cells preserve the integrity 
of their genomes. 

For the case of random segregation, we derived a set
of equations analogous to the quasispecies equations for
semiconservative replication with imperfect lesion repair.
In particular, the ordered strand pair formalism developed
for those equations was used.

For immortal strand co-segregation, we showed that 
an analogous ordered strand pair formalism is possible, 
though in contrast to random segregation, the labelling of 
parent and daughter strands leads to a canonical method 
for constructing an ordered strand pair from a given 
genome. This results in a different set of equations de- 
scribing the dynamics over the space of ordered strand 
pairs. 

Following the approach taken with the semiconservative
quasispecies equations with imperfect lesion repair, we
developed the infinite sequence length equations for the
stem cell population, assuming a fitness landscape defined
by a master genome. From both the random and immortal
strand equations it is readily shown that immortal strand
segregation with imperfect lesion repair helps to maintain
a population of stem cells.

From the infinite sequence length equations, we ob- 
tained the differential equations governing the decay of 
the master genome population, and developed a criterion 
for determining the optimal lesion repair probability for 
maximizing the population of stem cells with the geno- 
type defined by the master genome. Based on parameters 
for human stem cells, we predict that lesion repair should 
be completely turned off in adult human stem cells. This 
result, of course, is in the end a prediction made by a 
highly simplified model, and needs to be experimentally 
tested. Furthermore, because it appears that
postreplication mismatches and lesions due to DNA damage
are repaired by the same biochemical pathways, future
research will need to explicitly incorporate DNA damage
in order to refine our estimate for optimal lesion repair
efficiency in adult stem cells. Nevertheless, despite the
simplifying assumptions made in this work, we regard 
this paper as an important first step toward a quantita- 
tive modeling of stem cell evolutionary dynamics. 

In this paper, we assumed that the stem cell and tissue
genomes consist of only one chromosome. While one
chromosome is sufficient for studying immortal strand
co-segregation, in reality vertebrate cells contain numerous
chromosomes. Furthermore, it is known that certain
free-living organisms, such as Saccharomyces cerevisiae
variants (baker's yeast), segregate chromosomes according
to the immortal strand mechanism. For single-chromosome
genomes, the immortal strand mechanism cannot be applied to
free-living cells, since there is no qualitative distinction
between the two daughter cells (such as "stem" and
"tissue"). However, with multiple chromosomes, it is
possible for asymmetric segregation to occur so that one of
the daughter cells retains the chromosomes with the oldest
DNA strands. Thus, the study of immortal strand
co-segregation for multiple chromosome genomes is an
important extension of the model presented here and the
imperfect lesion repair quasispecies equations presented
in [7].